Pre-computed Algebraic Signatures for Fast String Search, Protection Against Incidental Viewing and Corruption of Data in an SDDS
Abstract
We propose to encode the records of a Scalable Distributed Data Structure (SDDS) using pre-computed algebraic signatures. The partly pre-computed algebraic signature of a string encodes each symbol as its contribution to the algebraic signature of the string. The cumulative pre-computed algebraic signature encodes each symbol as the signature of the string prefix ending with that symbol. Encoding and decoding under either scheme occur at the SDDS clients. For both schemes, and for each operation, the overhead has linear time complexity O(n), although it is slightly higher for the cumulative signature. The schemes protect SDDS data against incidental viewing by an unauthorized server administrator. They can also be used to detect and localize silent corruption. These features should be of interest to P2P and grid computing. Both schemes also provide fast string search (matching) directly on the encoded data at the SDDS servers. They appear to be an alternative to known Karp-Rabin type schemes in our context of a search in a file or a database. Both accelerate string search relative to the already fast use of algebraic signatures on the original data. Moreover, both appear to be typically the fastest in this context among the string search algorithms we are aware of. The cumulative signature provides the fastest searches. For a string of l symbols in a field of n symbols, the complexity is almost O(1) for a prefix search and O(n - l) for a string search. The string manipulation capabilities of our schemes should by themselves be of interest to applications.
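The two encodings and the search on encoded data described in the abstract can be illustrated with a minimal sketch. It rests on assumptions the abstract does not state: symbols are taken to be bytes treated as elements of GF(2^8), built with the reduction polynomial 0x11D and the primitive element alpha = 2, and the signature of a string p_0 ... p_(n-1) is taken to be the XOR-sum of p_i * alpha^i. All function names are hypothetical. A signature match only yields a candidate position, since different strings can share a signature.

```python
# Sketch only. Assumed: byte symbols in GF(2^8), polynomial 0x11D, alpha = 2.
POLY = 0x11D
ORDER = 255  # multiplicative order of alpha in GF(2^8)

# Log/antilog tables for multiplication in GF(2^8).
EXP = [0] * (2 * ORDER)
LOG = [0] * 256
_x = 1
for _i in range(ORDER):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= POLY
for _i in range(ORDER, 2 * ORDER):
    EXP[_i] = EXP[_i - ORDER]


def gf_mul(a: int, b: int) -> int:
    """Multiply two field elements."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def alpha_pow(i: int) -> int:
    """alpha**i; negative i gives the inverse power."""
    return EXP[i % ORDER]


def signature(s: bytes) -> int:
    """One-symbol signature: XOR-sum of s[k] * alpha**k."""
    sig = 0
    for k, sym in enumerate(s):
        sig ^= gf_mul(sym, alpha_pow(k))
    return sig


def encode_partial(data: bytes) -> bytes:
    """Partly pre-computed: symbol i becomes its contribution p_i * alpha**i."""
    return bytes(gf_mul(p, alpha_pow(i)) for i, p in enumerate(data))


def encode_cumulative(data: bytes) -> bytes:
    """Cumulative: symbol i becomes the signature of the prefix data[:i+1]."""
    out, acc = bytearray(), 0
    for i, p in enumerate(data):
        acc ^= gf_mul(p, alpha_pow(i))
        out.append(acc)
    return bytes(out)


def decode_cumulative(enc: bytes) -> bytes:
    """Client-side decoding in O(n): undo the running XOR, then divide by alpha**i."""
    out, prev = bytearray(), 0
    for i, c in enumerate(enc):
        out.append(gf_mul(c ^ prev, alpha_pow(-i)))
        prev = c
    return bytes(out)


def prefix_candidate(enc: bytes, pattern: bytes) -> bool:
    """Prefix test with one comparison once the pattern signature is known."""
    l = len(pattern)
    return 0 < l <= len(enc) and enc[l - 1] == signature(pattern)


def search_candidates(enc: bytes, pattern: bytes) -> list:
    """Candidate start offsets, found by scanning the cumulative encoding only."""
    l = len(pattern)
    if l == 0 or l > len(enc):
        return []
    sig = signature(pattern)
    hits = []
    for i in range(len(enc) - l + 1):
        before = enc[i - 1] if i > 0 else 0
        # Signature of the window [i, i+l) equals alpha**i * signature(pattern).
        if enc[i + l - 1] ^ before == gf_mul(sig, alpha_pow(i)):
            hits.append(i)
    return hits
```

In this sketch the server-side scan touches about n - l positions with one XOR, one table multiplication and one comparison each, which is consistent with the O(n - l) bound quoted above. Every true occurrence is reported, but a one-byte signature admits false positives, so a practical scheme would presumably use a longer signature or confirm candidates after decoding.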
Similar resources
Cumulative Algebraic Signatures for Fast String Search, Protection Against Incidental Viewing and Corruption of Data in an SDDS
Scalable Distributed Data Structures (SDDS) are a class of data structures for multicomputers (a distributed system of networked computers) that allow data access by key in constant time (independent of the number of nodes in the multicomputer) and parallel search of the data. In order to speed up the parallel search of the data fields of the records, we propose to encode the records of a Scala...
Pattern Matching Using n-gram Sampling Of Cumulative Algebraic Signatures: Preliminary Results
Extended Abstract We propose a novel string (pattern) matching algorithm called n-gram search. We intend it for the records stored once and searched many times in a database or a file, especially organized into a Scalable Distributed Data Structure, (SDDS), over a grid or a structured P2P net. We presume that the records are encoded into their cumulative algebraic signatures, providing incident...
Fast String Search Using n-Gram Sampling of Cumulative Algebraic Signatures: Preliminary Results
We propose a novel string (pattern) matching algorithm called n-gram search. Unlike the prominent algorithms, ours does not process the visited string (record) and the pattern directly. We intend it for the records stored once and searched many times in a database or a file, especially organized into a Scalable Distributed Data Structure, (SDDS), over a grid or a structured P2P net. Instead, it...
Fast nGram-Based String Search Over Data Encoded Using Algebraic Signatures
We propose a novel string search algorithm for data stored once and read many times. Our search method combines the sublinear traversal of the record (as in Boyer Moore or Knuth-Morris-Pratt) with the agglomeration of parts of the record and search pattern into a single character – the algebraic signature – in the manner of Karp-Rabin. Our experiments show that our algorithm is up to seventy ti...
Neural Network Based Protection of Software Defined Network Controller against Distributed Denial of Service Attacks
Software Defined Network (SDN) is a new architecture for network management and its main concept is centralizing network management in the network control level that has an overview of the network and determines the forwarding rules for switches and routers (the data level). Although this centralized control is the main advantage of SDN, it is also a single point of failure. If this main contro...
Publication date: 2005